Updating documents | Manticore Search Manual

Index rotation is a procedure in which the searchd server looks for new versions of the indexes defined in the configuration. Rotation applies only to the Plain mode of operation.

There are two cases:

plain indexes that are already loaded
indexes added to the configuration but not loaded yet

In the first case, indexer cannot put the new version of the index online because the running copy is locked and loaded by searchd. Here indexer needs to be called with the --rotate parameter. With --rotate, indexer creates new index files with .new. in their names and sends a HUP signal to searchd to inform it about the new version. searchd then performs a lookup, puts the new version of the index in place, and discards the old one. Sometimes you may want to create the new version of the index without rotating it right away, for example to check the health of the new index version first. In that case, pass indexer the --nohup parameter, which prevents it from sending the HUP signal to the server.
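The workflow above can be sketched in Python. This is a minimal illustration rather than anything shipped with Manticore: the helper names `indexer_command` and `pending_new_files` are hypothetical, and the sketch assumes that `indexer` is on the PATH and that rotated files follow the `.new.sp?` naming described in this section.

```python
import glob


def indexer_command(index, rotate=True, nohup=False):
    """Build an indexer command line for one index (hypothetical helper)."""
    cmd = ["indexer"]
    if rotate:
        cmd.append("--rotate")  # write files with .new. in their names
    if nohup:
        cmd.append("--nohup")   # do not send the HUP signal to searchd
    cmd.append(index)
    return cmd


def pending_new_files(index_path):
    """List files that are waiting to be rotated.

    index_path is assumed to be the index path without an extension,
    so /data/idx matches /data/idx.new.spa, /data/idx.new.spi, and so on.
    """
    return sorted(glob.glob(glob.escape(index_path) + ".new.sp?"))
```

A command built with `indexer_command("idx", nohup=True)` could be passed to `subprocess.run`. Once `pending_new_files` reports the expected files and they have been checked, the rotation itself can be triggered separately.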

New indexes can also be loaded by rotation. However, on a HUP signal searchd checks for new indexes only if the configuration has changed since server startup. If the index was already defined in the configuration, first create it by running indexer without --rotate, then execute the RELOAD INDEXES statement instead.

There are also two specialized statements that can be used to rotate indexes:

RELOAD INDEX idx [ FROM '/path/to/index_files' ];

RELOAD INDEX allows you to rotate indexes using SQL.

It has two modes of operation. In the first mode (no path specified), the Manticore server checks for new index files in the directory given by the index's path setting. The new index files must be named idx.new.sp?.

If you additionally specify a path, the server looks for index files in that directory, moves them to the index path, renames them from index_files.sp? to idx.new.sp?, and rotates them.

mysql> RELOAD INDEX plain_index;
mysql> RELOAD INDEX plain_index FROM '/home/mighty/new_index_files';
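When rotations are scripted, the two forms of the statement can be generated from one helper. The sketch below is an illustration only: `reload_index_sql` is a hypothetical name, and for simplicity it rejects index names and paths containing unexpected characters instead of trying to escape them.

```python
def reload_index_sql(index, from_path=None):
    """Return a RELOAD INDEX statement as a string (hypothetical helper)."""
    # Accept only letters, digits and underscores in the index name.
    if not index.replace("_", "").isalnum():
        raise ValueError("unexpected index name: %r" % index)
    sql = "RELOAD INDEX " + index
    if from_path is not None:
        # Refuse characters that would need escaping inside '...'.
        if "'" in from_path or "\\" in from_path:
            raise ValueError("unsupported character in path: %r" % from_path)
        sql += " FROM '" + from_path + "'"
    return sql + ";"
```

The resulting string can be sent through any MySQL-protocol client connected to searchd, exactly like the statements typed at the `mysql>` prompt above.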

RELOAD INDEXES;

Works the same as the system HUP signal: it initiates index rotation. Unlike regular HUP signalling (which can come from kill or indexer), the statement forces a lookup of indexes to rotate even if the configuration has not changed since server startup.

Depending on the value of the seamless_rotate setting, new queries might be briefly stalled, and clients will receive temporary errors. The command is non-blocking (i.e., it returns immediately).

mysql> RELOAD INDEXES;
Query OK, 0 rows affected (0.01 sec)
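Because clients may receive temporary errors while a rotation is in progress, application code usually wants to retry. The following is a generic sketch, not part of any Manticore client: `TemporaryError` is a hypothetical stand-in for whatever exception your client library raises for a temporary error, and `with_retry` is an illustrative helper name.

```python
import time


class TemporaryError(Exception):
    """Placeholder for the client's temporary-error exception (assumption)."""


def with_retry(run_query, attempts=5, delay=0.1, sleep=time.sleep):
    """Call run_query(), retrying on TemporaryError with a doubling delay."""
    for attempt in range(attempts):
        try:
            return run_query()
        except TemporaryError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(delay * (2 ** attempt))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In normal use, `with_retry(lambda: run_my_search())` is enough, where `run_my_search` is your own function.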

Rotation means that the old index version is discarded and the new index version is loaded in its place. During this swap, the server must also keep serving incoming queries on the index being updated. To avoid stalling queries, the server performs a seamless rotate by default, as described below.

Indexes may contain some data that needs to be precached in RAM. At the moment, .spa, .spb, .spi and .spm files are fully precached (they contain attribute data, blob attribute data, keyword index and killed row map, respectively.) Without seamless rotate, rotating an index tries to use as little RAM as possible and works as follows:

new queries are temporarily rejected (with "retry" error code);
searchd waits for all currently running queries to finish;
old index is deallocated and its files are renamed;
new index files are renamed and required RAM is allocated;
new index attribute and dictionary data is preloaded to RAM;
searchd resumes serving queries from new index.

However, if there is a lot of attribute or dictionary data, the preloading step can take noticeable time: up to several minutes when preloading files of 1-5 GB or more.

With seamless rotate enabled, rotation works as follows:

new index RAM storage is allocated
new index attribute and dictionary data is asynchronously preloaded to RAM
on success, old index is deallocated and both indexes' files are renamed
on failure, new index is deallocated
at any given moment, queries are served either from old or new index copy

Seamless rotate comes at the cost of higher peak memory usage during the rotation (because both old and new copies of .spa/.spb/.spi/.spm data need to be in RAM while preloading new copy). Average usage stays the same.

Example:

seamless_rotate = 1
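The memory trade-off described above can be made concrete with a little arithmetic. The function below is a simplified model derived from the two step lists in this section, not a measurement of searchd: it counts only the precached data and ignores all other memory use. The name `peak_ram` is hypothetical.

```python
def peak_ram(old_bytes, new_bytes, seamless):
    """Rough peak RAM held by precached index data during one rotation."""
    if seamless:
        # The old copy keeps serving queries while the new one is preloaded.
        return old_bytes + new_bytes
    # The old copy is deallocated before RAM for the new copy is allocated.
    return max(old_bytes, new_bytes)


GB = 1024 ** 3
seamless_peak = peak_ram(4 * GB, 5 * GB, seamless=True)   # 9 GB
plain_peak = peak_ram(4 * GB, 5 * GB, seamless=False)     # 5 GB
```

Under this model, rotating a 4 GB index to a 5 GB version needs room for about 9 GB of precached data with seamless rotate, against about 5 GB without it.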


Last modified: February 10, 2021

Updating documents

You can change existing data in an RT or PQ index by either updating or replacing it.

UPDATE replaces row-wise attribute values of existing documents with new values. Full-text fields and columnar attributes cannot be updated. If you need to change the content of a full-text field or a columnar attribute, use REPLACE.

REPLACE works similarly to INSERT, except that if an old document has the same ID as the new document, the old document is marked as deleted before the new document is inserted. Note that the old document is not physically deleted from the index. Physical deletion happens only when chunks of the index are merged, e.g. as a result of an OPTIMIZE.


Last modified: September 08, 2021

REPLACE works similarly to INSERT, but if a document with the same ID already exists, that document is marked as deleted before the new document is inserted.

Examples for SQL, HTTP, PHP, Python, JavaScript, and Java, in that order:

REPLACE INTO products VALUES(1, "document one", 10);

POST /replace
-H "Content-Type: application/x-ndjson" -d '
{
  "index":"products",
  "id":1,
  "doc":
  {
    "title":"document one",
    "price":10
  }
}
'

$index->replaceDocument([
    'title' => 'document one',
    'price' => 10
], 1);

indexApi.replace({"index" : "products", "id" : 1, "doc" : {"title" : "document one","price":10}})

res = await indexApi.replace({"index" : "products", "id" : 1, "doc" : {"title" : "document one","price":10}});

docRequest = new InsertDocumentRequest();
HashMap<String,Object> doc = new HashMap<String,Object>(){{
            put("title","document one");
            put("price",10);
}};
docRequest.index("products").id(1L).setDoc(doc); 
sqlresult = indexApi.replace(docRequest);

Responses for SQL, HTTP, PHP, Python, JavaScript, and Java, in that order:

Query OK, 1 row affected (0.00 sec)

{
  "_index":"products",
  "_id":1,
  "created":false,
  "result":"updated",
  "status":200
}

Array(
    [_index] => products
    [_id] => 1
    [created] => false
    [result] => updated
    [status] => 200
)

{'created': False,
 'found': None,
 'id': 1,
 'index': 'products',
 'result': 'updated'}

{"_index":"products","_id":1,"result":"updated"}

class SuccessResponse {
    index: products
    id: 1
    created: false
    result: updated
    found: null
}

REPLACE is supported for RT and PQ indexes.

The old document is not removed from the index; it is only marked as deleted. As a result, the index size grows until index chunks are merged: documents marked as deleted are not carried over into the chunk produced by the merge. You can force a chunk merge with the OPTIMIZE statement.
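The mark-as-deleted behaviour can be pictured with a toy model. The class below is only an illustration of the semantics described above. It does not reflect Manticore's actual storage format, and `ToyIndex` and its methods are hypothetical names.

```python
class ToyIndex:
    """Toy model of REPLACE: old versions are flagged, not removed."""

    def __init__(self):
        # Every stored row version: [doc_id, doc, deleted_flag]
        self.rows = []

    def replace(self, doc_id, doc):
        for row in self.rows:
            if row[0] == doc_id and not row[2]:
                row[2] = True  # mark the old version as deleted, keep it stored
        self.rows.append([doc_id, doc, False])

    def optimize(self):
        # A merge keeps only rows that are not marked as deleted.
        self.rows = [row for row in self.rows if not row[2]]

    def get(self, doc_id):
        for row in self.rows:
            if row[0] == doc_id and not row[2]:
                return row[1]
        return None
```

Replacing the same ID twice leaves two stored rows, of which only the newer one is visible to `get`. After `optimize()` a single row remains, which mirrors how the index shrinks once chunks are merged.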

The syntax of the REPLACE statement is identical to INSERT syntax:

REPLACE INTO index [(column1, column2, ...)]
    VALUES (value1, value2, ...)
    [, (...)]

REPLACE using HTTP protocol is performed via the /replace endpoint. There's also a synonym endpoint, /index.
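For readers calling the endpoint without the official client, the request shown in the HTTP example can be assembled with the Python standard library. This is a sketch under assumptions: `build_replace_request` is a hypothetical helper, the base URL is whatever address your searchd HTTP listener uses, and the Content-Type header simply mirrors the one used in the HTTP examples on this page.

```python
import json
import urllib.request


def build_replace_request(base_url, index, doc_id, doc):
    """Build (but do not send) a POST request for the /replace endpoint."""
    body = json.dumps({"index": index, "id": doc_id, "doc": doc})
    return urllib.request.Request(
        base_url.rstrip("/") + "/replace",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call. Building the request separately makes it easy to inspect the URL and body before anything is sent.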

Multiple documents can be replaced at once. See bulk adding documents for more details.

Examples for SQL, HTTP, PHP, Python, JavaScript, and Java, in that order:

REPLACE INTO products(id,title,tag) VALUES (1, 'doc one', 10), (2, 'doc two', 20);

POST /bulk
-H "Content-Type: application/x-ndjson" -d '
{ "replace" : { "index" : "products", "id":1, "doc": { "title": "doc one", "tag" : 10 } } }
{ "replace" : { "index" : "products", "id":2, "doc": { "title": "doc two", "tag" : 20 } } }
'

$index->replaceDocuments([
    [
        'id' => 1,
        'title' => 'document one',
        'tag' => 10
    ],
    [
        'id' => 2,
        'title' => 'document two',
        'tag' => 20
    ]
]);

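import json
import manticoresearch

# client is assumed to be an already configured manticoresearch.ApiClient
indexApi = manticoresearch.IndexApi(client)
docs = [
    {"replace": {"index" : "products", "id" : 1, "doc" : {"title" : "document one"}}},
    {"replace": {"index" : "products", "id" : 2, "doc" : {"title" : "document two"}}} ]
# The bulk endpoint expects one JSON document per line (NDJSON)
api_resp = indexApi.bulk('\n'.join(map(json.dumps,docs)))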

docs = [ 
    {"replace": {"index" : "products", "id" : 1, "doc" : {"title" : "document one"}}}, 
    {"replace": {"index" : "products", "id" : 2, "doc" : {"title" : "document two"}}} ];
res =  await indexApi.bulk(docs.map(e=>JSON.stringify(e)).join('\n'));

body = "{\"replace\": {\"index\" : \"products\", \"id\" : 1, \"doc\" : {\"title\" : \"document one\"}}}" +"\n"+ 
    "{\"replace\": {\"index\" : \"products\", \"id\" : 2, \"doc\" : {\"title\" : \"document two\"}}}"+"\n" ;         
indexApi.bulk(body);

Responses for SQL, HTTP, PHP, Python, JavaScript, and Java, in that order:

Query OK, 2 rows affected (0.00 sec)

{
  "items":
  [
    {
      "replace":
      {
        "_index":"products",
        "_id":1,
        "created":false,
        "result":"updated",
        "status":200
      }
    },
    {
      "replace":
      {
        "_index":"products",
        "_id":2,
        "created":false,
        "result":"updated",
        "status":200
      }
    }
  ],
  "errors":false
}

Array(
    [items] =>
    Array(
        Array(
            [_index] => products
            [_id] => 1
            [created] => false
            [result] => updated
            [status] => 200 
        )
        Array(
            [_index] => products
            [_id] => 2
            [created] => false
            [result] => updated
            [status] => 200 
        )
    )
    [errors] => false
)

{'error': None,
 'items': [{u'replace': {u'_id': 1,
                         u'_index': u'products',
                         u'created': False,
                         u'result': u'updated',
                         u'status': 200}},
           {u'replace': {u'_id': 2,
                         u'_index': u'products',
                         u'created': False,
                         u'result': u'updated',
                         u'status': 200}}]}

{"items":[{"replace":{"_index":"products","_id":1,"created":false,"result":"updated","status":200}},{"replace":{"_index":"products","_id":2,"created":false,"result":"updated","status":200}}],"errors":false}

class BulkResponse {
    items: [{replace={_index=products, _id=1, created=false, result=updated, status=200}}, {replace={_index=products, _id=2, created=false, result=updated, status=200}}]
    error: null
    additionalProperties: {errors=false}
}


Last modified: February 12, 2021